DETAILED ACTION

This is the initial response to the application filed October 15, 2003. Claims 1-19 are
pending and are considered below.



Claim Rejections - 35 USC § 112

The following is a quotation of the first paragraph of 35 U.S.C. 112: 

The specification shall contain a written description of the invention, and of the manner and process of 
making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the 
art to which it pertains, or with which it is most nearly connected, to make and use the same and shall 
set forth the best mode contemplated by the inventor of carrying out his invention. 

Claims 1, 8, 9 and 19 are rejected under 35 U.S.C. 112, first paragraph, as failing
to comply with the enablement requirement. The claim(s) contains subject matter which 
was not described in the specification in such a way as to enable one skilled in the art to 
which it pertains, or with which it is most nearly connected, to make and/or use the 
invention. 

Claim 1 recites "a first stage input and a second stage input", however no support 
for this is found in the specification. The specification supports a first stage input, 
however no description is provided for a second stage input. Claim 19 is rejected for 
similar reasons. 

Claim 8 recites translating by means of "speech recognition of the first stage
input and the second stage input", however there is no support for this found in the 
specification. The specification supports recognition of speech on a first stage input, 
however no description is provided for a second stage input or speech recognition 
performed on that second stage input. Claim 9 is rejected for similar reasons.

The examiner interprets a first and second stage input as two inputs, each input
received from a different peripheral device. This interpretation is used throughout the 
remainder of this office action. 



Claim Rejections - 35 USC § 103 

The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all 
obviousness rejections set forth in this Office action: 

(a) A patent may not be obtained though the invention is not identically disclosed or described as set 
forth in section 102 of this title, if the differences between the subject matter sought to be patented and 
the prior art are such that the subject matter as a whole would have been obvious at the time the 
invention was made to a person having ordinary skill in the art to which said subject matter pertains.
Patentability shall not be negatived by the manner in which the invention was made.

Claims 1, 2, 5-8, 10-15, 17 and 18 are rejected under 35 U.S.C. 103(a) as being
unpatentable over Greene (6,377,925) in view of Lemlouma ("NAC: A Basic Core for
the Adaptation and Negotiation of Multimedia Services" OPERA Project, 2001).

As per claim 1, Greene discloses a method of input conversion to output, said
conversion method comprising: entering a first stage input and a second stage input
(column 4 lines 46-52, multiple inputs from multiple peripheral devices are received);
translating the first stage input and the second stage input into an electronic format 
(column 4 line 65- column 5 line 9, inputs are received through a microphone or 
keyboard, which turns the analog input into an electronic signal which is processed by 
the computer). However, Greene does not disclose converting the electronic format into 
standard XML encoded with accessibility information, transforming the standard XML 
encoded with accessibility information into individual version of XML dependent on
desired output, and utilizing a rendering engine to modify the individual version of XML
into a format for the output. Lemlouma discloses converting the electronic format into 
standard XML encoded with accessibility information (Section 3 Architecture and 
section 4.2 The UCM: User Context Module, documents are saved in a server using 
XML format, and a user profile, which dictates user preferences for data presentation,
is stored as an XML file), converting the XML encoded with accessibility information
into individual version of XML dependent on desired output (section 2 Context of the 
Work, the system provides multimedia to the user dependent upon a target service 
format and their desired context or device), and utilizing a rendering engine to modify 
the individual version of XML into a format for the output (section 2 Context of the Work, 
the system provides multimedia to the user dependent upon a target service format and 
their desired context or device). Lemlouma discloses a system that provides a basic 
core to adapt multimedia information dependent upon a user-specified output (section 1
Introduction, last paragraph). 

Therefore it would have been obvious to one of ordinary skill in the art at the time 
of the invention to encode the inputs as XML, then convert these to specific outputs 
depending on a desired output in Greene, since XML provides a good separation 
between data and presentation, which makes adaptation of multimedia information to 
different platforms easy, as indicated in Lemlouma (Section 1 Introduction, last
paragraph and section 2 Context of the Work, first paragraph).

As per claim 2, Greene in view of Lemlouma disclose the conversion method of claim
1, and Greene further discloses wherein said entering step includes speaking (column 4
lines 65-67).

As per claim 5, Greene in view of Lemlouma disclose the conversion method of claim 
1, and Greene further discloses wherein said entering step includes by a peripheral 
(column 4 lines 46-52). 

As per claim 6, Greene in view of Lemlouma disclose the conversion method of claim 
1, and Greene further discloses wherein said entering step includes semantic
information (column 4 lines 65-67, speech).

As per claim 7, Greene in view of Lemlouma disclose the conversion method of claim 
1, and Greene further discloses wherein said entering step includes providing format 
and structure instructions (column 6 lines 10-12, the user can choose to enable or 
disable any of the translation functions, which dictate the output method of the 
translated input).

As per claim 8, Greene in view of Lemlouma disclose the conversion method of claim
2, however Greene does not explicitly disclose wherein said translating step includes
the use of automatic speech recognition of the first stage input and the second stage
input to produce the electronic format. Greene does disclose a system that receives
speech inputs and converts them into text, or other formats (column 4 line 65 - column 5
line 9), therefore it is inherent that speech recognition is used.

Therefore it would have been obvious to one of ordinary skill in the art at the time 
of the invention to have a translating step include automatic speech recognition at a first 
and second stage step in Greene and Lemlouma, since a two stage input would 
provide parallel processing of speech which enables increased speech processing, 
specifically for more than one speaker, and enables a real-time implementation. 

As per claim 10, Greene in view of Lemlouma discloses the conversion method of 
claim 1, however Greene does not disclose wherein the individual version of XML is
XHTML for desired output. Lemlouma discloses wherein the individual version of XML 
is XHTML for desired output (section 2 Context of the Work, the system provides 
multimedia to the user dependent upon a target service format and their desired context 
or device, one of the target service formats including XHTML). 

Therefore it would have been obvious to one of ordinary skill in the art at the time 
of the invention to use the individual version of XML as XHTML in Greene and
Lemlouma, since XHTML formatting instructions can be easily obtained, thus removing
the need to devote time and resources to develop a new method to represent image
and text information. 

As per claim 11, Greene in view of Lemlouma discloses the conversion method of
claim 1, however Greene does not disclose wherein the individual version of XML is
VoiceXML for desired output. Lemlouma discloses wherein the individual version of 
XML is VoiceXML for desired output (section 2 Context of the Work, the system 
provides multimedia to the user dependent upon a target service format and their 
desired context or device, one of the target service formats including VoiceXML). 

Therefore it would have been obvious to one of ordinary skill in the art at the time 
of the invention to use the individual version of XML as VoiceXML in Greene, since
VoiceXML formatting instructions can be easily obtained, thus removing the need to
devote time and resources to develop a new method to represent speech information.

As per claim 12, Greene in view of Lemlouma discloses the conversion method of 
claim 1, however Greene does not disclose wherein the individual version of XML is a
custom XML for desired output. Lemlouma discloses that the individual version of XML
is a custom XML for desired output (section 2 Context of the Work, the system provides 
multimedia information to the user dependent upon a target service format and their 
desired context or device).

Therefore it would have been obvious to one of ordinary skill in the art at the time
of the invention to use the individual version of XML as a custom XML in Greene, since
it would enable the system to easily represent information structured according to a user 
request without the need to devote time and resources to a new method to represent 
image and text information. 

As per claim 13, Greene in view of Lemlouma disclose the conversion method of claim
10, and Greene further discloses wherein the desired output is text (column 5 lines 2-3).

As per claim 14, Greene in view of Lemlouma disclose the conversion method of claim
11, and Greene further discloses wherein the desired output is synthesized speech
(column 7 lines 53-55). 

As per claim 15, Greene in view of Lemlouma disclose the conversion method of claim
12, and Greene further discloses wherein the desired output is virtual sign language
(column 6 lines 31-33). 

As per claim 17, Greene in view of Lemlouma discloses the conversion method of 
claim 12, however Greene does not explicitly disclose wherein the desired output is
electronic large print. Greene does disclose an electronic translator to assist a person
with disabilities, which outputs information as text (Abstract). 

Therefore it would have been obvious to one of ordinary skill in the art at the time 
of the invention to output electronic large print in Greene, since it would provide 
enhanced functionality for a user who is visually impaired. 

As per claim 18, Greene in view of Lemlouma disclose the conversion method of claim 
1, however Greene does not disclose wherein said transforming step includes the use
of extensible stylesheet language transformations to transform the standard XML 
encoded with accessibility information into the individual version of XML dependent on 
desired output. Lemlouma discloses wherein said transforming step includes the use of 
extensible stylesheet language transformations to transform the standard XML encoded 
with accessibility information into the individual version of XML dependent on desired 
output (section 1 Introduction, last paragraph, and section 2 Context of the Work, 
second paragraph, adaptations to multimedia information are performed by 
transformation methods using XSLT; those adaptations dependent upon the target 
service format). 

Therefore it would have been obvious to one of ordinary skill in the art at the time 
of the invention to use extensible stylesheet language transformations to transform the 
standard XML encoded with accessibility information into the individual version of XML 
dependent on desired output in Greene, since XSLT stylesheets can be easily obtained,
thus removing the need to devote time and resources to develop a new method of
transformation. 

Claims 4 and 16 are rejected under 35 U.S.C. 103(a) as being unpatentable over
Greene in view of Lemlouma as applied to claims 1, 3, 10-12 above, and further in view
of Blattner ("Multimodal Integration" IEEE 1996).

As per claim 4, Greene in view of Lemlouma discloses the conversion method of claim 
1, however neither discloses wherein said entering step includes writing. Blattner
discloses a multimodal system, which is capable of integrating writing modalities (page 
20, section Agents, an agent based system creates agents for each modality, which 
processes its own signal; one of the modalities including writing). 

Therefore it would have been obvious to one of ordinary skill in the art at the time 
of the invention to have an entering step that includes writing in Greene and 
Lemlouma, since it would enable an alternate input for a person with a speech 
disability. 

As per claim 16, Greene in view of Lemlouma discloses the conversion method of 
claim 12, however neither discloses wherein the desired output is electronic Braille.
Blattner discloses a multimodal system that can output Braille (page 16, section
Computer Output modalities, Braille is listed as a linguistic nonarbitrary output).

Therefore it would have been obvious to one of ordinary skill in the art at the time 
of the invention to include Braille as an output method in Greene and Lemlouma, since 
it would provide enhanced functionality for a user who is visually impaired. 

Claims 3 and 9 are rejected under 35 U.S.C. 103(a) as being unpatentable over
Greene in view of Lemlouma as applied to claim 1 above, and further in view of 
Sakiyama (5,953,363). 

As per claim 3, Greene in view of Lemlouma discloses the conversion method of claim 
1, however neither discloses wherein said entering step includes gesturing. Sakiyama
discloses a system that uses input gestures for sign language recognition (column 15
lines 6-20).

Therefore it would have been obvious to one of ordinary skill in the art at the time 
of the invention to have an entering step include gestures in Greene and Lemlouma, 
since it would enable enhanced functionality for a person with an aural disability, thus
improving communication as indicated in Sakiyama (column 1 lines 42-55). 

As per claim 9, Greene in view of Lemlouma disclose the conversion method of claim 
3, however neither discloses wherein said translating step includes the use of sign
language recognition of the first stage input and the second stage input to produce the
electronic format. Sakiyama discloses a system that performs sign language
recognition on input gestures (column 15 lines 6-20). In addition, Sakiyama discloses a 
system that receives multiple inputs (column 8 lines 26-40 and Figure 1a). 

Therefore it would have been obvious to one of ordinary skill in the art at 
the time of the invention to use sign language recognition as a translation step in 
Greene and Lemlouma, since it would enable enhanced functionality for a person with 
an aural disability, thus improving communication as indicated in Sakiyama (column 1
lines 42-55). In addition, a two-stage input would provide parallel processing, which 
enables increased processing, specifically for more than one speaker, and enables a 
real-time implementation. 

Claim 19 is rejected under 35 U.S.C. 103(a) as being unpatentable over 
Sakiyama in view of Lemlouma and further in view of Zang (6,865,599). 

As per claim 19, Sakiyama discloses a method of input conversion to output, said
conversion method comprising: entering a first stage input of speech (column 8 lines 27- 
30) and a second stage of gestures (column 15 lines 6-20), translating the speech input 
using automatic speech recognition into an electronic format (column 8 lines 27-40) and 
the gestures input using sign language recognition into an electronic format (column 15 
lines 6-20). However Sakiyama does not disclose converting the electronic format into 
standard XML encoded with accessibility information, transforming the standard XML
encoded with accessibility information using extensible stylesheet language
transformations into XHTML for desired output of text, and utilizing a rendering engine 
of accessible instant messenger to modify the XHTML into the desired output of text. 
Lemlouma discloses converting the electronic format into standard XML encoded with 
accessibility information (Section 3 Architecture and section 4.2 The UCM: User Context 
Module, documents are saved in a server using XML format, and a user profile, which 
dictates user preferences for data presentation, is stored as an XML file), transforming
the standard XML encoded with accessibility information using extensible stylesheet 
language transformations into XHTML for desired output of text (section 1 Introduction, 
last paragraph, and section 2 Context of the Work, second paragraph, adaptations to 
multimedia information are performed by transformation methods using extensible 
stylesheet language transformations (XSLT), those adaptations dependent upon the 
target service format, one of the target formats being XHTML), utilizing a rendering 
engine to modify the individual version of XML into a format for the output (section 2 
Context of the Work, the system provides multimedia to the user dependent upon a 
target service format and their desired context or device). Lemlouma does not explicitly 
disclose using a rendering engine of an accessible instant messenger to modify the
XHTML into the desired output. However, Lemlouma does disclose a system for the
adaptation of multimedia information between devices with different service formats and
multimedia representations (section 4 The NAC CORE, first paragraph). In addition, 
Zang discloses a system that uses an Instant Messenger to deliver XHTML data 
(column 29 lines 52-56). Zang discloses that Instant Messaging is a communications
format often used to represent media of different forms and formats (column 1 lines
30-33).

Therefore it would have been obvious to one of ordinary skill in the art at the time 
of the invention to convert the electronic format into XML with accessibility information, 
transform the XML using XSLT, then modify the XHTML to the desired output using an 
instant messenger in Sakiyama, since XML provides a good separation between data and
presentation, which makes adaptation of multimedia information to different platforms 
easy, as indicated in Lemlouma (Section 1 Introduction, last paragraph and section 2
Context of the Work, first paragraph), and an Instant Messenger provides ready made 
software for the presentation of media of different formats, such as rich text and voice 
messages, as indicated in Zang (column 29 lines 50-51), thus removing the need to 
spend time and resources developing a method to deliver multimedia data. 

Conclusion 

The prior art made of record and not relied upon is considered pertinent to 
applicant's disclosure. 

• Nguyen (6,072,494) discloses a system for gesture recognition. 

• Johnson (5,748,974) discloses a multimodal natural language interface. 

• Trabelsi ("A Voice and Ink XML Multimodal Architecture for Mobile
e-commerce Systems" Workshop on Mobile Commerce 2002) discloses a
multimodal interface for voice application.

• Wang ("Implementation of a Multimodal Dialog System Using Extended
Markup Languages" Conference on Spoken Language Processing 2000)
discloses a multimodal system developed using XML.

Any inquiry concerning this communication or earlier communications from the 
examiner should be directed to Dorothy Sarah Siedler whose telephone number is 571- 
270-1067. The examiner can normally be reached on Mon-Thur 9:30am-5:30pm. 

If attempts to reach the examiner by telephone are unsuccessful, the examiner's 
supervisor, Richemond Dorvil, can be reached on 571-272-7602. The fax phone
number for the organization where this application or proceeding is assigned is 571- 
273-8300. 

Information regarding the status of an application may be obtained from the 
Patent Application Information Retrieval (PAIR) system. Status information for 
published applications may be obtained from either Private PAIR or Public PAIR. 
Status information for unpublished applications is available through Private PAIR only. 
For more information about the PAIR system, see http://pair-direct.uspto.gov. Should 
you have questions on access to the Private PAIR system, contact the Electronic 
Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a 
USPTO Customer Service Representative or access to the automated information 
system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.